AITopics | model function

Collaborating Authors

model function

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

c2c701fe341a7756ca7fd4eaa83ff63f-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 01:01:44 GMT

complexity, model-based method, optimization, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

On Linear Stability of SGD and Input-Smoothness of Neural Networks

Neural Information Processing SystemsDec-24-2025, 10:51:47 GMT

The multiplicative structure of parameters and input data in the first layer of neural networks is explored to build connection between the landscape of the loss function with respect to parameters and the landscape of the model function with respect to input data. By this connection, it is shown that flat minima regularize the gradient of the model function, which explains the good generalization performance of flat minima. Then, we go beyond the flatness and consider high-order moments of the gradient noise, and show that Stochastic Gradient Dascent (SGD) tends to impose constraints on these moments by a linear stability analysis of SGD around global minima. Together with the multiplicative structure, we identify the Sobolev regularization effect of SGD, i.e.

linear stability, name change, sgd and input-smoothness, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Add feedback

Dualing GANs

Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel

Neural Information Processing SystemsNov-21-2025, 04:51:10 GMT

It is however well known that GAN training suffers from instability due to the nature of its saddle point formulation. In this paper, we explore ways to tackle the instability problem by dualizing the discriminator.

artificial intelligence, discriminator, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers

Shaw, Peter, Cohan, James, Eisenstein, Jacob, Toutanova, Kristina

arXiv.org Artificial IntelligenceSep-30-2025

The Minimum Description Length (MDL) principle offers a formal framework for applying Occam's razor in machine learning. However, its application to neural networks such as Transformers is challenging due to the lack of a principled, universal measure for model complexity. This paper introduces the theoretical notion of asymptotically optimal description length objectives, grounded in the theory of Kolmogorov complexity. We establish that a minimizer of such an objective achieves optimal compression, for any dataset, up to an additive constant, in the limit as model resource bounds increase. We prove that asymptotically optimal objectives exist for Transformers, building on a new demonstration of their computational universality. We further show that such objectives can be tractable and differentiable by constructing and analyzing a variational objective based on an adaptive Gaussian mixture prior. Our empirical analysis shows that this variational objective selects for a low-complexity solution with strong generalization on an algorithmic task, but standard optimizers fail to find such solutions from a random initialization, highlighting key optimization challenges. More broadly, by providing a theoretical framework for identifying description length objectives with strong asymptotic guarantees, we outline a potential path towards training neural networks that achieve greater compression and generalization.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.22445

Genre:

Research Report (0.82)
Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (1.00)

Add feedback

c2c701fe341a7756ca7fd4eaa83ff63f-Paper.pdf

Neural Information Processing SystemsAug-17-2025, 05:49:33 GMT

artificial intelligence, machine learning, optimization, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

On Linear Stability of SGD and Input-Smoothness of Neural Networks

Neural Information Processing SystemsAug-15-2025, 19:40:36 GMT

This is relevant when the learning rate is not very small.

artificial intelligence, machine learning, training data, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

On Linear Stability of SGD and Input-Smoothness of Neural Networks

Neural Information Processing SystemsJan-15-2025, 11:44:14 GMT

linear stability, neural network, sgd and input-smoothness, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Dualing GANs

Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel

Neural Information Processing SystemsOct-2-2024, 19:17:56 GMT

Generative adversarial nets (GANs) are a promising technique for modeling a distribution from samples. It is however well known that GAN training suffers from instability due to the nature of its saddle point formulation. In this paper, we explore ways to tackle the instability problem by dualizing the discriminator. We start from linear discriminators in which case conjugate duality provides a mechanism to reformulate the saddle point objective into a maximization problem, such that both the generator and the discriminator of this'dualing GAN' act in concert. We then demonstrate how to extend this intuition to non-linear formulations. For GANs with linear discriminators our approach is able to remove the instability in training, while for GANs with nonlinear discriminators our approach provides an alternative to the commonly used GAN training algorithm.

discriminator, gan, objective, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

Notes on Kalman Filter (KF, EKF, ESKF, IEKF, IESKF)

Im, Gyubeom

arXiv.org Artificial IntelligenceJun-27-2024

The Kalman Filter (KF) is a powerful mathematical tool widely used for state estimation in various domains, including Simultaneous Localization and Mapping (SLAM). This paper presents an in-depth introduction to the Kalman Filter and explores its several extensions: the Extended Kalman Filter (EKF), the Error-State Kalman Filter (ESKF), the Iterated Extended Kalman Filter (IEKF), and the Iterated Error-State Kalman Filter (IESKF). Each variant is meticulously examined, with detailed derivations of their mathematical formulations and discussions on their respective advantages and limitations. By providing a comprehensive overview of these techniques, this paper aims to offer valuable insights into their applications in SLAM and enhance the understanding of state estimation methodologies in complex environments.

bel, equation, kalman filter, (14 more...)

arXiv.org Artificial Intelligence

2406.06427

Country: Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Genre:

Research Report (0.40)
Overview (0.34)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

A Margin-based Multiclass Generalization Bound via Geometric Complexity

Munn, Michael, Dherin, Benoit, Gonzalvo, Javier

arXiv.org Machine LearningMay-28-2024

There has been considerable effort to better understand the generalization capabilities of deep neural networks both as a means to unlock a theoretical understanding of their success as well as providing directions for further improvements. In this paper, we investigate margin-based multiclass generalization bounds for neural networks which rely on a recent complexity measure, the geometric complexity, developed for neural networks. We derive a new upper bound on the generalization error which scales with the margin-normalized geometric complexity of the network and which holds for a broad family of data distributions and model classes. Our generalization bound is empirically investigated for a ResNet-18 model trained with SGD on the CIFAR-10 and CIFAR-100 datasets with both original and random labels.

complexity, geometric complexity, inequality, (10 more...)

arXiv.org Machine Learning

2405.1859

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback